Bandit and covariate processes, with finite or non-denumerable set of arms

Abstract

We introduce herein a new approach to nonparametric multi-armed bandit theory involving both the bandit and covariate processes. Following Berry et al. (1997), we assume a non-denumerable set of arms for the bandit process. The approach we develop can be readily extended to continuous-time processes by using ε-greedy randomization and arm elimination instead of dynamic allocation indices. It also carries out stochastic search, with O(1) expected time, for nearly optimal arms at the covariate values in a given set B before applying arm elimination. The procedure is shown to attain the asymptotically minimal rates of regret over B.
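As a rough illustration of the two ingredients named in the abstract, ε-greedy randomization and arm elimination, the sketch below combines them for a small finite set of Bernoulli arms. It is not the paper's procedure: it ignores covariates and the non-denumerable case, and the function name `epsilon_greedy_elimination`, the decaying schedule `k / t`, and the Hoeffding-style confidence radius are all assumptions made for the example.

```python
import math
import random


def epsilon_greedy_elimination(means, horizon, seed=0):
    """Play Bernoulli arms with epsilon-greedy sampling plus arm elimination.

    `means` are the true success probabilities (unknown to the player and
    used only to simulate rewards). Returns the surviving arms and pull counts.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k
    totals = [0.0] * k
    active = set(range(k))

    def estimate(i):
        # Sample mean of arm i; 0.0 until the arm has been pulled.
        return totals[i] / pulls[i] if pulls[i] else 0.0

    def radius(i):
        # Assumed Hoeffding-style radius; infinite until the arm is sampled.
        if pulls[i] == 0:
            return math.inf
        return math.sqrt(2.0 * math.log(horizon) / pulls[i])

    for t in range(1, horizon + 1):
        epsilon = min(1.0, k / t)  # assumed decaying exploration schedule
        candidates = sorted(active)
        if rng.random() < epsilon:
            arm = rng.choice(candidates)          # explore uniformly
        else:
            arm = max(candidates, key=estimate)   # exploit the best estimate
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward

        # Eliminate arms whose upper bound falls below the best lower bound.
        best_lower = max(estimate(i) - radius(i) for i in active)
        active = {i for i in active if estimate(i) + radius(i) >= best_lower}

    return active, pulls
```

A call such as `epsilon_greedy_elimination([0.2, 0.5, 0.8], horizon=5000, seed=1)` returns the set of arms that were never eliminated together with the number of pulls per arm. The arm with the largest lower bound always survives its own test, so the active set is never empty.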

Related resources

Denumerable Constrained Markov Decision Processes and Finite Approximations

The purpose of this paper is twofold. First, to establish the theory of discounted constrained Markov decision processes with countable state and action spaces and a general multi-chain structure. Second, to introduce finite approximation methods. We define the occupation measures and obtain properties of the set of all achievable occupation measures under the different admissible policies. We est...

Multi-armed Bandit Problems with Strategic Arms

We study a strategic version of the multi-armed bandit problem, where each arm is an individual strategic agent and we, the principal, pull one arm each round. When pulled, the arm receives some private reward $v_a$ and can choose an amount $x_a$ to pass on to the principal (keeping $v_a - x_a$ for itself). All non-pulled arms get reward 0. Each strategic arm tries to maximize its own utility over the cour...

Denumerable Constrained Markov Decision Problems and Finite Approximations

The purpose of this paper is twofold. First, to establish the theory of discounted constrained Markov decision processes with countable state and action spaces and a general multi-chain structure. Second, to introduce finite approximation methods. We define the occupation measures and obtain properties of the set of all achievable occupation measures under the different admissible policies. We est...

On Non-denumerable Graphs

PROOF. We shall first prove that every complete graph of power $\aleph_1$ can be split up into the countable sum of trees. Let $G$ be a complete graph of cardinal number $\aleph_1$. Let $\{x_\alpha\}$, $\alpha < \omega_1$, be any well ordered set of power $\aleph_1$. We may assume that $G$ is represented by a system of segments $(x_\alpha, x_\beta)$, $\alpha < \beta < \omega_1$. For any $\beta < \omega_1$ arrange the set of all $\alpha < \beta$ into a sequence $\alpha_{\beta,n}$, $n = 1, 2, \ldots$, and let $G_n$ be...

Dynamic Pricing under Finite Space Demand Uncertainty: A Multi-Armed Bandit with Dependent Arms

We consider a dynamic pricing problem under unknown demand models. In this problem a seller offers prices to a stream of customers and observes either success or failure in each sale attempt. The underlying demand model is unknown to the seller and can take one of N possible forms. In this paper, we show that this problem can be formulated as a multi-armed bandit with dependent arms. We propose...

Journal

Journal title: Stochastic Processes and their Applications

Year: 2022

ISSN: 1879-209X, 0304-4149

DOI: https://doi.org/10.1016/j.spa.2022.03.010